PLCFRS Parsing Revisited: Restricting the Fan-Out to Two

نویسندگان

  • Wolfgang Maier
  • Miriam Kaeshammer
  • Laura Kallmeyer
چکیده

Linear Context-Free Rewriting System (LCFRS) is an extension of Context-Free Grammar (CFG) in which a non-terminal can dominate more than a single continuous span of terminals. Probabilistic LCFRS have recently successfully been used for the direct data-driven parsing of discontinuous structures. In this paper we present a parser for binary PLCFRS of fan-out two, together with a novel monotonous estimate for A∗ parsing, with which we conduct experiments on modified versions of the German NeGra treebank and the Discontinuous Penn Treebank in which all trees have block degree two. The experiments show that compared to previous work, our approach provides an enormous speed-up while delivering an output of comparable richness.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

PLCFRS Parsing of English Discontinuous Constituents

This paper proposes a direct parsing of non-local dependencies in English. To this end, we use probabilistic linear context-free rewriting systems for data-driven parsing, following recent work on parsing German. In order to do so, we first perform a transformation of the Penn Treebank annotation of non-local dependencies into an annotation using crossing branches. The resulting treebank can be...

متن کامل

Data-driven Parsing using PLCFRS Data-driven Parsing using Probabilistic Linear Context-Free Rewriting Systems

This paper presents the first efficient implementation of a weighted deductive CYK parser for Probabilistic Linear Context-Free Rewriting Systems (PLCFRS). LCFRS, an extension of CFG, can describe discontinuities in a straightforward way and is therefore a natural candidate to be used for data-driven parsing. To speed up parsing, we use different context-summary estimates of parse items, some o...

متن کامل

Optimal Parsing Strategies for Linear Context-Free Rewriting Systems

Reduction is the operation of transforming a production in a Linear Context-Free Rewriting System (LCFRS) into two simpler productions by factoring out a subset of the nonterminals on the production’s righthand side. Reduction lowers the rank of a production but may increase its fan-out. We show how to apply reduction in order to minimize the parsing complexity of the resulting grammar, and stu...

متن کامل

Data-Driven Parsing with Probabilistic Linear Context-Free Rewriting Systems

This paper presents a first efficient implementation of a weighted deductive CYK parser for Probabilistic Linear ContextFree Rewriting Systems (PLCFRS), together with context-summary estimates for parse items used to speed up parsing. LCFRS, an extension of CFG, can describe discontinuities both in constituency and dependency structures in a straightforward way and is therefore a natural candid...

متن کامل

Optimal Rank Reduction for Linear Context-Free Rewriting Systems with Fan-Out Two

Linear Context-Free Rewriting Systems (LCFRSs) are a grammar formalism capable of modeling discontinuous phrases. Many parsing applications use LCFRSs where the fan-out (a measure of the discontinuity of phrases) does not exceed 2. We present an efficient algorithm for optimal reduction of the length of production right-hand side in LCFRSs with fan-out at most 2. This results in asymptotical ru...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2012